Psychometric evaluation and cross-cultural adaptation of the Australian Pelvic Floor Questionnaire (APFQ-IR) in Iranian reproductive age women

Pelvic floor disorders (PFDs), a silent alert, are among the most pervasive and debilitating health concerns for women worldwide; in developed countries, one in four women suffers from PFDs. The validity and reliability of the Australian Pelvic Floor Questionnaire (APFQ) have not been determined in Iran, so this study was conducted to determine its psychometric characteristics in women of reproductive age in Tabriz, Iran. This methodological cross-sectional study determined the psychometric properties of the Persian version of the questionnaire (APFQ-IR) in 5 steps, namely translation process, content validity, face validity, construct validity (exploratory and confirmatory factor analyses and examination of ceiling and floor effects) and reliability, in 400 women of reproductive age attending health centers in Tabriz, Iran, who were selected by cluster random sampling between May 2022 and September 2022. The translation followed two approaches: the dual panel method and Beaton et al.'s five steps. To evaluate content validity, face validity, and construct validity, 10 instrument and PFD experts and 10 women from the target group reviewed the instrument's items, and 400 eligible women completed the instrument. Reliability was determined with two internal consistency measures (Cronbach's alpha and McDonald's omega) and the test–retest method (intraclass correlation coefficient, ICC). Content validity assessment of the APFQ-IR showed a good level of validity (CVR = 0.96, CVI = 0.94). For construct validity, exploratory factor analysis of 36 items identified 4 factors (bladder function, bowel function, prolapse symptom and sexual function), which explained 45.53% of the cumulative variance; the Kaiser–Meyer–Olkin value of 0.750 indicated a sufficient sample size. Confirmatory factor analysis supported the model fit (RMSEA = 0.08, SRMR = 0.08, TLI = 0.90, CFI = 0.93, χ2/df = 3.52). Internal consistency and reliability were high for the entire instrument (Cronbach's alpha = 0.85; McDonald's omega (95% CI) = 0.85 (0.83–0.87); ICC (95% CI) = 0.88 (0.74–0.94)). The Persian version of the questionnaire, APFQ-IR, has good validity and reliability and acceptable psychometric properties, and can therefore be used both for research purposes and for clinical evaluation of PFD symptoms in health centers.


Translation process
To carry out the translation process and determine the psychometric properties of the instrument, permission was first requested from the original designers of the tool 33. Then, to increase the accuracy of the translation, two approaches were used: the dual panel (DP) method and Beaton et al.'s five steps (Guidelines for the Process of Cross-Cultural Adaptation of Self-Report Measures). In the first approach (DP), the translation process is done in three steps 39. The first panel (expert panel) consisted of 10 reproductive health, obstetrics and nursing education specialists. The second panel (layman panel) consisted of 10 eligible women. In the third stage (the target group panel), 400 eligible women of reproductive age completed the questionnaire in the presence of the researcher. In the second approach, according to Beaton et al.'s guidelines, the translation process was implemented in 5 stages: Stage I: Initial Translation, Stage II: Synthesis of the Translations, Stage III: Back Translation, Stage IV: Expert Committee, and Stage V: Pretesting 40.
In the first stage, two translators (T1, T2) whose mother tongue was Farsi performed the translation completely independently using the forward-translation method. The first translator was well versed in the field of PFDs, whereas the second translator should preferably have no medical or clinical background. The latter is called a naive translator and is more likely than the first translator to recognize a different meaning in the original text; the translation she/he provides reflects the language used by the population and often highlights ambiguous meanings in the original questionnaire. Finally, the two translators and a supervisor combined the results of the translations during sessions, using the original questionnaire together with the versions of the first translator (T1) and the second translator (T2), to produce a joint translation (T-12) 41.
In the backward-translation stage, two other translators who were native English speakers (BT1, BT2) and completely blind to the original version of the questionnaire translated the T-12 version back into English. The translators also ensured that the final version of the questionnaire was comprehensible to a 12-year-old (approximately 6th grade reading level). The fourth stage was the expert committee's review. The committee reviewed all the translations (T1, T2, T-12, BT1, and BT2) along with the written reports from the four viewpoints of semantic equivalence, idiomatic equivalence, experiential equivalence, and conceptual equivalence. The final stage was the pre-test, in which the pre-final version is used in the target group. In the pre-test phase, the researcher provided the final translated version of the questionnaire to 30 women of reproductive age who met the criteria in health centers. This was done to evaluate the clarity and comprehensibility of the final version for the target group and also to examine its internal consistency. After responding, the women were asked about their understanding of the questions, the level of difficulty, and the cultural appropriateness of the phrases. Participants in this stage were encouraged to provide feedback on all sections of the questionnaire so that the final Iranian version would be more culturally acceptable in the Iranian community 41.

Content and face validity
After the final version of the questionnaire was prepared, the content and face validity of the APFQ-IR were assessed by giving the content validity determination form to 10 reproductive health, obstetrics and nursing education specialists with expertise in instruments and in PFDs, and to 10 eligible women. In the qualitative section, to assess the content validity of the APFQ-IR, experts' opinions were obtained on the overall structure of the questionnaire, the content of the items, Persian grammar and correct scoring, and corrections were then made. In the quantitative section, the content validity ratio (CVR) and content validity index (CVI) were calculated 42. To calculate the CVR, experts rated the necessity of each item of the instrument on a 3-point Likert scale (necessary, useful but unnecessary, and unnecessary). According to Lawshe's CVR technique, the minimum acceptable value for 10 experts is 0.62; as a result, items with CVR > 0.62 were kept 43. To determine the CVI, experts were then asked to rate the relevance, clarity and simplicity of the items on a 4-point Likert scale based on the Waltz and Bausell index 44. CVI values vary between 0 and 1, and items with a CVI of more than 0.79 were accepted 45. To determine qualitative face validity, the items were assessed in terms of difficulty level, relevance and ambiguity by 10 eligible women (target group). For quantitative face validity, the item impact method with a 5-point Likert scale from unimportant (1) to very important (5) was used to determine the impact score, and items with an impact score ≥ 1.5 were kept 46.
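The quantitative indices described above can be illustrated with a minimal Python sketch. It assumes Lawshe's usual formula, CVR = (n_e − N/2)/(N/2), an item-level CVI defined as the share of experts rating the item 3 or 4, and an impact score defined as the proportion of raters choosing 4 or 5 multiplied by the mean importance rating. The function names and the ratings are hypothetical and are not taken from the study data.

```python
def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Lawshe's CVR: (n_e - N/2) / (N/2), where n_e experts rated the item 'necessary'."""
    half = n_experts / 2
    return (n_essential - half) / half


def item_cvi(ratings: list[int]) -> float:
    """Item-level CVI: share of experts giving 3 or 4 on a 4-point scale (assumed definition)."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)


def impact_score(ratings: list[int]) -> float:
    """Impact score: proportion rating 4 or 5 times mean importance on a 5-point scale."""
    frequency = sum(1 for r in ratings if r >= 4) / len(ratings)
    importance = sum(ratings) / len(ratings)
    return frequency * importance


# Hypothetical ratings, for illustration only.
cvr_9_of_10 = content_validity_ratio(9, 10)   # 9 of 10 experts say "necessary"
cvr_8_of_10 = content_validity_ratio(8, 10)   # 8 of 10 experts say "necessary"
cvi_example = item_cvi([4, 4, 3, 4, 3, 4, 4, 2, 4, 4])
impact_example = impact_score([5, 4, 4, 3, 5, 4, 2, 5, 4, 4])

print(cvr_9_of_10 > 0.62, cvr_8_of_10 > 0.62)      # item kept / item not kept
print(cvi_example > 0.79, impact_example >= 1.5)   # both criteria met
```

Under these assumptions, an item needs at least 9 of 10 experts to rate it as necessary in order to exceed the 0.62 cut-off, because 8 of 10 gives a CVR of 0.6.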

Construct validity
Finally, construct validity was assessed using exploratory factor analysis (EFA), with the Kaiser-Meyer-Olkin (KMO) and Bartlett's test of sphericity criteria. To determine the factors, the principal component analysis method with varimax rotation (direct oblimin) was used. Factor loadings above 0.3 were considered acceptable 47,48. In confirmatory factor analysis (CFA), a series of indices was used to assess the fit of the model: the root mean square error of approximation (RMSEA < 0.08), the standardized root mean square residual (SRMR < 0.10), the normed chi-square (χ2/df < 5), and comparative fit indices including the comparative fit index (CFI > 0.90) and the Tucker-Lewis index (TLI > 0.90) 49,50. Finally, after factor analysis and removal of inappropriate items from the instrument, the floor and ceiling effect (F/C) was assessed, i.e. the proportions of participants with the lowest and the highest possible scores were examined to judge whether floor and ceiling effects, respectively, were present.

Based on a rule of thumb, the sample size for exploratory factor analysis is classified as 50 = very poor, 100 = poor, 200 = fair, 300 = good, 500 = very good and 1000 = excellent 51. The number of samples for construct validity assessment in factor analysis is 5 to 10 for each instrument item. Therefore, considering 5 samples per item, a design effect of 1.5 and 30% attrition, 400 eligible women who had been referred to Tabriz health centers were selected using the cluster sampling method. For sampling, a quarter of the centers were randomly selected using the website http://www.random.org, and the samples were selected from lists in the SIB system (integrated health system).

The inclusion criteria were reproductive age (15-49 years), sexual activity, a monogamous husband and no known pregnancy at the time of the study; women were included regardless of whether they had a diagnosis of PFD. Women with dementia, psychological disorders such as depression, intellectual disabilities, schizophrenia, addiction to drugs and/or alcohol, previous or current malignancy, delivery within the previous 12 months, a recent history of urinary tract infection (UTI), a history of gynecological surgery including reconstructive surgery, cosmetic surgery and pelvic surgery, sexually transmitted diseases (STDs), or white blood cell (WBC) count > 3 in the urine analysis test (U/A), as well as illiterate women, were excluded from the study.
Then, after giving the participants a comprehensive explanation of the research and receiving informed consent, the researcher provided them with the socio-demographic and obstetric characteristics questionnaire and the Persian version of the Australian Pelvic Floor Questionnaire (APFQ-IR). Socio-demographic and obstetric characteristics included age, body mass index (BMI), gravidity, parity, education level, occupation, income, smoking status, type of delivery, hysterectomy, prolapse surgery, and family history of PFDs. The APFQ was used to investigate PFDs, and its validity (content validity, face validity, construct validity) and reliability were assessed in this study. This instrument was designed by Baessler et al. in Australia; it contains 43 questions divided into four factors: bladder function (BLF, Q1-15), bowel function (BOF, Q16-27), prolapse symptom (PS, Q28-32), and sexual function (SF, Q33-43). The scoring is not based on a Likert scale. Most of the questions are scored from 0 to 3 using different descriptions, such as Never, Occasionally, Frequently, and Daily to evaluate intensity/repetition, and Not at all, Slightly, Moderately, and Greatly to estimate how bothersome the symptoms are. The scores in each area are summed separately, divided by the maximum attainable score for that area and then multiplied by 10. The overall score for each area is therefore between 0 and 10, and the maximum total score for PFDs is 40 52. The higher the score, the more severe the PFDs.
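The scoring rule can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' scoring script: it assumes that each area's raw sum is normalised by the maximum attainable raw sum of that area, so that every area score falls between 0 and 10 and the total between 0 and 40, as stated above. The per-item maxima and the responses are hypothetical.

```python
def domain_score(responses: list[int], item_maxima: list[int]) -> float:
    """Scale one area to 0-10: raw sum / maximum attainable sum * 10 (assumed rule)."""
    if len(responses) != len(item_maxima):
        raise ValueError("one response is required per item")
    for response, maximum in zip(responses, item_maxima):
        if not 0 <= response <= maximum:
            raise ValueError("response outside the item's scoring range")
    return 10 * sum(responses) / sum(item_maxima)


def total_score(domain_scores: dict[str, float]) -> float:
    """Total PFD score: sum of the four area scores (0-40)."""
    return sum(domain_scores.values())


# Hypothetical respondent: five prolapse items, each scored 0-3.
ps = domain_score([1, 0, 2, 0, 0], [3, 3, 3, 3, 3])   # 10 * 3 / 15

# Hypothetical area scores for the other three factors.
scores = {"BLF": 1.5, "BOF": 0.5, "PS": ps, "SF": 3.0}
print(ps, total_score(scores))
```

Passing the item maxima explicitly keeps the sketch usable for questions that are not scored from 0 to 3, since the text notes that only most questions use that range.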

Reliability
To determine the reliability of the questionnaire, test-retest reliability and internal consistency were used 53. For test-retest reliability, the questionnaire was completed twice, with a time interval of two weeks, by 30 eligible women of reproductive age who had been referred to the health centers of Tabriz city and were selected by random sampling. Internal consistency was assessed by determining Cronbach's alpha coefficient and McDonald's omega coefficient for each factor and for the whole instrument. An intraclass correlation coefficient (ICC) greater than 0.6 and Cronbach's alpha and McDonald's omega coefficients above 0.7 were considered favorable 54.
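As an illustration of the internal consistency criterion, the following minimal Python sketch computes Cronbach's alpha from a respondents-by-items matrix using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). It is not the analysis code used in the study, and the response matrix is hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha for a matrix with one row per respondent and one column per item."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five women to three items scored 0-3.
responses = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 2, 3],
    [1, 0, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2), alpha > 0.7)   # compare against the 0.7 criterion
```

McDonald's omega and the ICC require a factor model and a variance-components model respectively, so they are better left to a dedicated statistics package.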

Ethical considerations
The present study was approved by the Ethics Committee of Tabriz University of Medical Sciences (Ethics code: IR.TBZMED.REC.1400.1073). All ethical principles, including obtaining the necessary permission from the original designers of the instrument (Baessler et al.), obtaining written informed consent from all participants, ensuring the confidentiality of their information, and respecting their freedom to withdraw from the study, were observed at every step.

Statistical analysis
SPSS Statistics 14 (IBM Corp, Armonk, NY, USA), STATA 14 (StataCorp, College Station, Texas, USA) and R software 4.2 (psych package) were used for data analysis. Socio-demographic and obstetric characteristics were described using mean (SD) for quantitative variables and frequencies (percentages) for qualitative variables. Content validity was determined through CVR and CVI, face validity through the impact score, construct validity through EFA and CFA, and reliability through Cronbach's alpha coefficient, McDonald's omega and ICC.

Ethics approval and consent to participate
The current study was approved by the Ethics Committee of Tabriz University of Medical Sciences [ref: IR.TBZMED.REC.1400.1073]. Written informed consent to participate in the study was obtained from all the participants before enrolment. All methods were performed in accordance with the Declaration of Helsinki.

Results
A total of 400 women of reproductive age with a mean (SD) age of 34.4 (7.2) years (range 16-49) were included in this study between May 2022 and September 2022 using the cluster sampling method. The mean (SD) body mass index was 26.9 (4.1), and more than three quarters of the participants (82.3%) were housewives. Other socio-demographic and obstetric characteristics of the participants are summarized in Table 1.
In the content validity assessment of the tool, CVI and CVR were 0.94 and 0.96, respectively, which indicates good content validity of the instrument. However, question 36 of the instrument (vaginal sensation during intercourse) (SF4) was corrected because it had a CVI of 0.58. In the face validity review, all the items were described as appropriate, unambiguous and not difficult, and each received an impact score of at least 1.5. The details of the content and face validity assessment are shown in Table 2.
In the construct validity assessment, the total number of questions in the original version of the instrument was 43, but 3 questions (SF1-SF3) were not included in the exploratory factor analysis because of their different response format and missing values (only sexually inactive women had to answer them). Exploratory factor analysis was therefore performed on 40 items and led to the extraction of 4 factors that explained 45.53% of the variance. During the exploratory factor analysis, questions 17 (BOF2), 24 (BOF9), 28 (PS1) and 41 (SF9) were also removed because their factor loadings were less than 0.3, so that the number of questions was finally reduced from 43 to 36 (Fig. 1).
The four factors extracted during exploratory factor analysis were as follows. The first factor was BLF, which includes 15 questions and accounts for 14.99% of the total variance. The second factor was BOF, with 10 questions, which explains 11.90% of the total variance. The third and fourth factors were PS, with 4 questions, and SF, with 7 questions, explaining 8.75% and 9.88% of the total variance, respectively (Table 3). The Kaiser-Meyer-Olkin value of 0.750 indicated the adequacy of the sample size. The result of Bartlett's sphericity test was also significant (P ≤ 0.001), which indicated that the correlation matrix in the studied sample was suitable for factor analysis (Table 4).
Finally, the reliability statistics of the tool (Cronbach's alpha = 0.85; McDonald's omega (95% CI) = 0.85 (0.83-0.87); Intraclass Correlation Coefficient (95% CI) = 0.88 (0.74-0.94)) showed good instrument reliability. Moreover, no ceiling effect was observed in the overall score or in the sub-domains, whereas the floor effect in the overall score (APFQ-IR) was 8.5%; the values for the sub-domains are detailed in Table 6.
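The floor and ceiling percentages reported here can be computed as the share of respondents who obtain the lowest and the highest possible score. The Python sketch below illustrates this under stated assumptions: the scores are hypothetical, and the 15% cut-off used to flag an effect is a commonly used convention that is assumed here, not a threshold given in this study.

```python
from math import isclose


def floor_ceiling(scores: list[float], min_score: float, max_score: float) -> tuple[float, float]:
    """Return the percentage of respondents at the lowest and at the highest possible score."""
    n = len(scores)
    at_floor = sum(1 for s in scores if isclose(s, min_score, abs_tol=1e-9))
    at_ceiling = sum(1 for s in scores if isclose(s, max_score, abs_tol=1e-9))
    return 100 * at_floor / n, 100 * at_ceiling / n


# Hypothetical total scores on the 0-40 scale.
totals = [0.0, 0.0, 3.5, 12.0, 7.25, 1.0, 0.0, 22.0, 5.5, 9.0]
floor_pct, ceiling_pct = floor_ceiling(totals, min_score=0.0, max_score=40.0)

THRESHOLD = 15.0  # assumed convention, not taken from the study
print(floor_pct, ceiling_pct, floor_pct > THRESHOLD, ceiling_pct > THRESHOLD)
```

With such a cut-off, the overall floor effect of 8.5% reported above would not be flagged.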

Discussion
Pelvic floor disorders (PFDs) are a significant health problem among women living in low and middle-income countries. The problem persists because many women with PFDs hide their condition owing to misconceptions, lack of awareness of treatment options, fear of discrimination, feelings of shame and cultural norms 55,56. Therefore, a reliable instrument to measure the symptoms of PFDs seems necessary. This study, which aimed to psychometrically evaluate the APFQ among Iranian women, indicates that the Persian version (APFQ-IR) has good validity and reliability.
At present, despite the design of numerous questionnaires to evaluate PFDs with emphasis on the domains of urinary incontinence 57,58, fecal incontinence 59 and, in some cases, pelvic organ prolapse 60, there are only a few valid questionnaires that cover all domains (bladder, bowel, prolapse and sexual domains) together. Among them, the ICI questionnaires (www.iciq.net), despite having strong assessment criteria, are not designed to be used in clinical practice 61.
Although the Pelvic Floor Distress Inventory-20 (PFDI-20) and the Pelvic Floor Impact Questionnaire (PFIQ-7) are recommended by the ICI with grade B 60,62, these questionnaires (long and short forms) are not designed to be used in routine urogynecological practice, since they are aimed at measuring the intensity and frequency of symptoms (never, occasionally, frequently, etc.) and not at specifically evaluating sexual function. Therefore, these questionnaires alone are not useful in clinical practice. The only questionnaire that integrates all areas is the Global Pelvic Floor Bother Questionnaire, which has 9 questions, but its disadvantages include the lack of dedicated sections for each area and the allocation of only one question (question 9) to sexual activity 63.
During the exploratory factor analysis in this study, 4 factors (BLF, BOF, PS and SF) were extracted for the 36 questions of the questionnaire, which explained about 45.53% of the variance. As part of the instrument's validity assessment, the value of KMO and the significance of Bartlett's test were also assessed and confirmed the adequacy of the model. Although the psychometric properties of the APFQ have been investigated in several languages worldwide, the construct validity has only been examined in the Arabic and Spanish versions. The first study was conducted on the Arabic version (2021) by Malaekah et al., who identified four factors explaining 36.64% of the variance. The results of exploratory analysis showed KMO = 0.806 and a Bartlett sphericity test value of 4150.46 64. The second study was performed on the Spanish version (2022) by Molina-Torres et al., who identified two factors explaining 31.26% of the variance. The results of exploratory factor analysis showed KMO = 0.858, with a significant value in the Bartlett sphericity test (P < 0.001) 37. The factors extracted during exploratory factor analysis are in line with the factors reported for the Pelvic Floor Distress Inventory-20 (PFDI-20) and Pelvic Floor Impact Questionnaire (PFIQ-7) measures 62, with the difference that the APFQ, having a sexual function section, is more complete than these two questionnaires.
Despite the ICI's emphasis on the use of these two questionnaires, the APFQ has some advantages over them. Its strengths compared with other questionnaires are the existence of a separate section for examining women's sexual function and a response format that focuses on measuring the frequency and severity of symptoms, the level of bother and issues related to quality of life in these conditions. To determine internal consistency, Cronbach's alpha coefficient was obtained as 0.85 for the APFQ-IR and 0.68-0.78 for its subscales; the values reported for the original scale were 0.74-1.00, so the present values fall almost within the range of the original scale 33. The obtained results were also similar to those of the Turkish (0.73-0.86) 35 and Serbian (0.82-0.84) 36 versions, but smaller than those of the Spanish (0.83-0.93) 37 and Chinese (0.83-0.89) 34 versions. Moreover, the ICC in this study was 0.77-0.95, which is in accordance with the range of the original instrument (0.74-1.00) 33 and higher than that of the Chinese version (0.22-0.88) 34 and the Spanish version (0.59-0.96) 37.
Because of the strong role of women in the health of the family and of society, the high prevalence of PFDs and the wide range of complications they cause are considered a silent alarm about reduced quality of life (QOL) and interference with the roles of women in society. As a result, a dedicated instrument with which health care providers in health centers can evaluate the symptoms of PFDs, followed by training, diagnostic and therapeutic measures, becomes all the more important.

Strengths and limitations
The strengths of this study include investigating the psychometric properties of the APFQ-IR for the first time in Iran; the comprehensiveness of the tool and its coverage of all areas of PFDs, especially the sexual function section; a response format that, compared with other tools in this field, emphasizes measuring the severity and frequency of symptoms; the combination of the dual panel method and the five-stage method in the translation process; and the possibility of comparison with versions from other countries. The limitations of the present study are the use of the same data set for exploratory and confirmatory factor analysis, the absence of general quality of life questions in the questionnaire, and the smaller number of respondents on the sexual function scale owing to sexual inactivity.

Conclusion
The Persian version of the questionnaire, APFQ-IR, has good validity and reliability and acceptable psychometric properties, and can therefore be used both for research purposes and for clinical evaluation of PFD symptoms in health centers. Health policy makers should therefore design special programs for screening, training, diagnosis and treatment in order to evaluate and follow up women with PFDs, with the aim of improving their performance and QOL. https://doi.org/10.1038/s41598-023-50417-5

Table 1.
Socio-demographic and obstetric characteristics of participants for factor analysis of APFQ-IR (n = 400). SD standard deviation, BMI body mass index, NVD normal vaginal delivery, C/S cesarean section.

Table 2.
The results for the content and face validity of the Iranian version of APFQ-IR (n = 10). CVI content validity index, CVR content validity ratio.

Figure 1.
Factor structure model of the APFQ-IR based on CFA. (All factor loadings are significant at P < 0.001). BLF bladder function, BOF bowel function, PS prolapse symptom, SF sexual function.

Table 3.
Result of factor analysis of the APFQ-IR scale based on EFA (n = 400).

Table 4.
KMO and Bartlett's test of the Iranian version of APFQ-IR (n = 400). KMO Kaiser-Meyer-Olkin, df degrees of freedom.

Table 6.
Reliability statistics and floor and ceiling effects of the APFQ-IR. ICC intraclass correlation coefficient, CI confidence interval, BLF bladder function, BOF bowel function, PS prolapse symptom, SF sexual function, APFQ Australian Pelvic Floor Questionnaire.